Abstract

Holroyd and Propp used Hall's marriage theorem to show that,
given a probability distribution $\pi$ on a finite set $S$, there exists an
infinite sequence $s_1, s_2, \ldots$ in $S$ such that for all integers $k \geq 1$ and
all $s$ in $S$, the number of $i$ in $[1,k]$ with $s_i = s$ differs from $k\pi(s)$
by at most 1. We prove a generalization of this result using a simple 
explicit algorithm. A special case of this algorithm yields an exten- 
sion of Holroyd and Propp's result to the case of discrete probability 
distributions on infinite sets. [Note added in 2010: Since writing and 
posting this article on the arXiv, we have learned that both Theorem 
1 and Theorem 2 are in the literature; see the articles by Tijdeman 
that have been added to the bibliography.] 

Recently there has been an upsurge of interest in non-random processes
that mimic interesting aspects of random processes, where the fidelity of the
mimicry is a consequence of discrepancy constraints built into the
constructions (for a general survey of discrepancy theory, see [1]). A recent example
is the work of Friedrich, Gairing and Sauerwald [7] on load-balancing; other
examples, linked by their use of the "rotor-router mechanism", are the work
of Cooper, Doerr, Friedrich, Spencer, and Tardos [2, 3, 4, 5, 6] on derandomized
random walk on grids ("P-machines"), the work of Landau, Levine and
Peres [9, 11, 12, 13] on derandomized internal diffusion-limited aggregation
on grids and trees, and the work of Holroyd and Propp [8] on derandomized
Markov chains. Here we focus on derandomizing something even more
fundamental to probability theory: the notion of an independent sequence of
discrete random variables.

Given a discrete probability distribution $\pi(\cdot)$ on a set $S$, the
associated i.i.d. process satisfies the law of large numbers. That is, if we choose
$s_1, s_2, \ldots$ from $S$ independently at random in accordance with $\pi$, the
random variables $N_k(s) := \#\{i : 1 \leq i \leq k \text{ and } s_i = s\}$ have the property
that $N_k(s)/k \to \pi(s)$ almost surely as $k \to \infty$, and indeed the discrepancies
$N_k(s) - k\pi(s)$ are typically $O(\sqrt{k})$. A derandomized analogue of an i.i.d.
process should have the property that $N_k(s) - k\pi(s)$ is $o(k)$, and derandomized
processes with $|N_k(s) - k\pi(s)|$ as small as possible are especially interesting.
It is too much to ask that the unscaled differences $N_k(s) - k\pi(s)$ themselves
go to zero (since $N_k(s)$ is always an integer while $k\pi(s)$ typically is not), but
we can ask that these differences stay bounded.

Holroyd and Propp [8] used such derandomized i.i.d. processes
("low-discrepancy stacks", in their terminology) in order to make their theory
applicable to Markov chains with irrational transition probabilities. Indeed, the
following theorem appears (with slightly different notation) as Proposition
11 in [8].

Theorem 1 (Low-discrepancy sequences for i.i.d. processes; [8]). Given a
probability distribution $\pi$ on a finite set $S$, there exists an infinite sequence
$s_1, s_2, \ldots$ in $S$ such that for all $k \geq 1$ and all $s$ in $S$, the number of $i$ in $[1,k]$
with $s_i = s$ differs from $k\pi(s)$ by at most 1.

Here we give a proof of this result that is simultaneously simpler and 
more constructive than Holroyd and Propp's, and applies even when the set 
S is infinite. Furthermore, our construction gives a simple way to deran- 
domize sequences of discrete random variables that are independent but not 
identically distributed. 

Theorem 2 (Low-discrepancy sequences for independent processes). Given
discrete probability distributions $\pi_1, \pi_2, \ldots$ on some countable set $S$, there
exists an infinite sequence $s_1, s_2, \ldots$ in $S$ such that for all $k \geq 1$ and all $s$
in $S$, the quantities $N_k(s) := \#\{i : 1 \leq i \leq k \text{ and } s_i = s\}$ and
$P_k(s) := \sum_{i=1}^{k} \pi_i(s)$ differ in absolute value by strictly less than 1.

Theorem 1 is the special case of Theorem 2 in which $S$ is finite and
$\pi_1 = \pi_2 = \cdots = \pi$, with an infinitesimally weaker inequality in the conclusion.

Remark. For every choice of $s_1, s_2, \ldots$, we have $\sum_s N_k(s) = k = \sum_s P_k(s)$,
whence $\sum_s (N_k(s) - P_k(s)) = 0$. Moreover, if we were to choose $s_1, s_2, \ldots$
independently from $S$ in accordance with the respective probability
distributions $\pi_i$, then the expected number of $i$ in $[1,k]$ with $s_i = s$ would be
$\sum_{i=1}^{k} \pi_i(s) = P_k(s)$, so the expected value of $N_k(s) - P_k(s)$ would be zero for each $s$.

Note: It has come to our attention that Theorem 2 is not new; the same
result (with the same proof) was discovered by Tijdeman in the 1970s (see the
two articles by Tijdeman in the bibliography). Indeed, Tijdeman's result is
slightly stronger than ours in the case where $S$ is finite (he obtains the
optimal constant, which is slightly smaller than 1 and depends on the size of $S$).

Proof of Theorem 2. We present an algorithm for determining the sequence
$(s_k)$. Our algorithm is as follows. Given $s_1, \ldots, s_k$ (with $k \geq 0$), let $s_{k+1}$ be
the candidate $s$ with the earliest deadline, where we say $s$ is a candidate
(for being the $(k+1)$st term) if $N_k(s) - P_{k+1}(s) < 0$, and where we define
the deadline for such an $s$ as the smallest integer $k' \geq k+1$ for which
$N_k(s) - P_{k'}(s) \leq -1$. Ties may be resolved in any fashion.

For $k \geq 0$ write $D_k(s) := N_k(s) - P_k(s)$ (note that $D_0(s) = 0$). First
observe that $s$ is a candidate if and only if taking $s_{k+1} = s$ would lead to
$D_{k+1}(s) < 1$; equivalently, $s$ fails to be a candidate if and only if taking
$s_{k+1} = s$ would lead to $D_{k+1}(s) \geq 1$. That is, when we are choosing the $(k+1)$st term
of the sequence, $s$ fails to be a candidate if and only if choosing $s_{k+1}$ to be $s$
would cause $s$ to be oversampled from time 1 to time $k+1$. It is clear that
there is always at least one candidate, since $\sum_s (N_k(s) - P_{k+1}(s)) = -1 < 0$.

Also note that if $s$ is a candidate with deadline $k'$, then taking $s_{k+1}, \ldots, s_{k'}$
all unequal to $s$ would lead to $D_{k'}(s) \leq -1$; that is, such an $s$ would be
undersampled from time 1 to time $k'$ if it were not chosen to be at least
one of $s_{k+1}, \ldots, s_{k'}$.

For $k' \geq k \geq 0$ define $R_{k,k'}(s) := \lfloor P_{k'}(s) - N_k(s) \rfloor^+$ (where $x^+ :=
\max(x, 0)$). Thus $R_{k,k'}(s)$ is the minimal number of the terms $s_{k+1}, \ldots, s_{k'}$
that must be equal to $s$ in order to prevent $s$ from being undersampled
from time 1 to time $k'$. If $R_{k,k+1}(s) = 1$, then $s_{k+1}$ must be chosen to
equal $s$ in order to prevent $s$ from being undersampled from time 1 to time
$k+1$; we call such an $s$ critical. We will show by induction on $k \geq 0$
that $\sum_s R_{k,k'}(s) \leq k' - k$ for all $k' \geq k$. This implies in particular that
$\sum_s R_{k,k+1}(s) \leq 1$ for all $k$, so that at each step at most one $s$ is critical; this
in turn implies that no $s$ is ever undersampled. And since our procedure only
chooses candidates, no $s$ is ever oversampled either.

First, consider $k = 0$: we have $D_0(s) = 0$ and $R_{0,k'}(s) = \lfloor P_{k'}(s) \rfloor$, so
$\sum_s R_{0,k'}(s) = \sum_s \lfloor P_{k'}(s) \rfloor \leq \lfloor \sum_s P_{k'}(s) \rfloor = k' - 0$, as claimed.

Now take $k \geq 0$, and suppose for induction that $\sum_s R_{k,k'}(s) \leq k' - k$
for some particular $k' \geq k+1$. We wish to show that $\sum_s R_{k+1,k'}(s) \leq
k' - (k+1)$. There are two cases to consider. First, if $R_{k,k'}(s_{k+1}) > 0$, then
$R_{k+1,k'}(s_{k+1}) = R_{k,k'}(s_{k+1}) - 1$ and $R_{k+1,k'}(s) = R_{k,k'}(s)$ for all $s \neq s_{k+1}$, so
$\sum_s R_{k+1,k'}(s) = (\sum_s R_{k,k'}(s)) - 1 \leq (k' - k) - 1 = k' - (k+1)$, as claimed.
Second, if $R_{k,k'}(s_{k+1}) = 0$, then the deadline for $s_{k+1}$ is greater than $k'$.
Since our algorithm chooses $s_{k+1}$ as the $(k+1)$st term, $s_{k+1}$ must be the
candidate with the earliest deadline. This means that no $s \neq s_{k+1}$ in $S$ with
$R_{k,k'}(s) > 0$ is a candidate; that is, every $s \neq s_{k+1}$ with $R_{k,k'}(s) > 0$ must
satisfy $N_k(s) \geq P_{k+1}(s)$, and since $N_{k+1}(s) = N_k(s)$, we must have $D_{k+1}(s) \geq 0$,
implying
$R_{k+1,k'}(s) = \lfloor -D_{k+1}(s) + \sum_{i=k+2}^{k'} \pi_i(s) \rfloor^+ \leq \lfloor \sum_{i=k+2}^{k'} \pi_i(s) \rfloor^+ \leq \sum_{i=k+2}^{k'} \pi_i(s)$.
Likewise, for all $s$ with $R_{k,k'}(s) = 0$, we have $R_{k+1,k'}(s) = 0$,
implying $R_{k+1,k'}(s) \leq \sum_{i=k+2}^{k'} \pi_i(s)$. Hence
$\sum_s R_{k+1,k'}(s) \leq \sum_s \sum_{i=k+2}^{k'} \pi_i(s) = \sum_{i=k+2}^{k'} \sum_s \pi_i(s) = \sum_{i=k+2}^{k'} 1 = k' - (k+1)$, as claimed. □

In reading the proof, the reader may find it helpful to imagine the
following scenario. Let $S$ be a set of creatures, each of which has a surplus
(or "energy level") that is initially 0. At each step a single creature gets fed.
At the $k$th step, the surplus $D_k(s)$ of creature $s$ decreases by $\pi_k(s)$, but in
addition increases by 1 (giving a net change of $1 - \pi_k(s)$) if $s$ gets fed. After
each step the sum of the surpluses is zero. If a creature's surplus ever falls
to $-1$ or less, the creature dies of starvation; if its surplus ever rises to 1 or
more, it dies of overfeeding. Our strategy for keeping all the creatures alive is
to always feed the creature that if left unfed would die earliest of starvation,
excepting those that cannot be fed because they would immediately die of
overfeeding.
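
The feeding strategy above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' code: the distributions are given as a finite list of dictionaries with exact `Fraction` values, a deadline that falls beyond the end of the list is treated as infinite, and ties are broken by symbol order. The names `low_discrepancy_sequence` and `max_discrepancy` are illustrative.

```python
from fractions import Fraction
from math import inf


def low_discrepancy_sequence(dists):
    """Return s_1..s_n for the distributions pi_1..pi_n in `dists`.

    Each element of `dists` maps symbols to exact probabilities summing to 1.
    """
    symbols = sorted({s for d in dists for s in d})
    n = len(dists)
    # P[k][s] is the partial sum P_k(s) = pi_1(s) + ... + pi_k(s).
    P = [{s: Fraction(0) for s in symbols}]
    for d in dists:
        P.append({s: P[-1][s] + d.get(s, 0) for s in symbols})

    N = {s: 0 for s in symbols}  # N[s] is the count N_k(s) so far
    seq = []
    for k in range(n):
        def deadline(s):
            # Smallest k' >= k+1 with N_k(s) - P_k'(s) <= -1, within the horizon.
            for kp in range(k + 1, n + 1):
                if N[s] - P[kp][s] <= -1:
                    return kp
            return inf  # assumption: no deadline inside the known horizon

        # s is a candidate if choosing it keeps D_{k+1}(s) below 1.
        candidates = [s for s in symbols if N[s] - P[k + 1][s] < 0]
        chosen = min(candidates, key=deadline)  # ties go to the first symbol
        seq.append(chosen)
        N[chosen] += 1
    return seq


def max_discrepancy(seq, dists):
    """Largest |N_k(s) - P_k(s)| over all prefixes k and all symbols s."""
    symbols = {s for d in dists for s in d}
    N = {s: 0 for s in symbols}
    P = {s: Fraction(0) for s in symbols}
    worst = Fraction(0)
    for chosen, d in zip(seq, dists):
        N[chosen] += 1
        for s in symbols:
            P[s] += d.get(s, 0)
            worst = max(worst, abs(N[s] - P[s]))
    return worst
```

Exact rational arithmetic is used so that the comparisons with $-1$ and $0$ are not affected by rounding. Calling `max_discrepancy(low_discrepancy_sequence(dists), dists)` on any list of exact distributions should return a value strictly below 1, in line with Theorem 2.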

Independently of our work, in the context of a one-sided version of the
discrepancy-control problem arising from an email post by John Lee [10],
Oded Schramm and Fedja Nazarov considered other algorithms for keeping
the quantities $k\pi(s) - N_k(s)$ from becoming too large in the case where all
the $\pi_i$ equal $\pi$.

Considering the sequence of discrepancy vectors $(D_k)$ leads to the
following reformulation of Theorem 1.

Corollary 3. For any probability vector $\pi \in \mathbb{R}^n$ there is a compact $K \subset
[-1,1]^n$ containing $(0, 0, \ldots, 0)$ such that $K \subset \bigcup_{i=1}^{n} (K + \pi - e_i)$, where
$\{e_1, \ldots, e_n\}$ is the standard basis of $\mathbb{R}^n$.

To see that this implies Theorem 1, note that, given $K$ and $D_k$ (with
$k \geq 0$), we can choose $s_{k+1} = i$ such that $D_k - \pi + e_i \in K$; this choice
guarantees that the discrepancy vector $D_{k+1}$ lies in $K$ and hence in $[-1,1]^n$.
Conversely, note that if we take the sequence $s_1, s_2, \ldots$ given by Theorem
1, the bounded set $\{D_k : k \geq 0\}$ satisfies all the conditions of Corollary 3
except for compactness. Hence we can prove Corollary 3 by taking $K$ to be
the closure of $\{D_k : k \geq 0\}$. Corollary 3 asserts the existence of a set $K$
containing the origin that is covered by translations of itself by given vectors
$v_i := \pi - e_i$. Since the zero vector is in the convex hull of the $v_i$, a sufficiently
large ball in the subspace spanned by the $v_i$ achieves this, but Corollary 3
provides the bound $K \subset [-1,1]^n$.

The constant 1 in Theorem 2 cannot be improved; that is, there is no
$c < 1$ with the property that for all $\pi_1, \pi_2, \ldots$ there exists a way to choose
$s_1, s_2, \ldots$ from $S$ so that the discrepancies $N_k(s) - P_k(s)$ all stay within
the interval $(-c, +c)$. Consider for instance the case where each $\pi_i$ is the
uniform distribution on a finite set $S$ of cardinality $n$, and take $k = n - 1$.
There exists $s$ in $S$ distinct from $s_1, \ldots, s_k$ and this $s$ satisfies $P_k(s) - N_k(s) =
k/n - 0 = 1 - 1/n$. Although this example shows that the constant 1 cannot be
improved, it is possible that there is a universal strict subset $K$ of $(-1,+1) \times
(-1,+1) \times \cdots$ with the property that for all $\pi_1, \pi_2, \ldots$ there exists a way of
choosing $s_1, s_2, \ldots$ from $S$ so that the vector of discrepancies stays within
the set $K$.
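
The uniform example can be checked exhaustively for small $n$. The sketch below is illustrative only (the function name `best_discrepancy_at_time` is our own); it enumerates every sequence of length $n - 1$ over $n$ symbols and reports the smallest achievable value of $\max_s |N_k(s) - P_k(s)|$ at time $k = n - 1$.

```python
from fractions import Fraction
from itertools import product


def best_discrepancy_at_time(n):
    """Smallest achievable max_s |N_k(s) - P_k(s)| at k = n - 1, uniform pi on n symbols."""
    k = n - 1
    expected = Fraction(k, n)  # P_k(s) is the same for every symbol s
    best = None
    for seq in product(range(n), repeat=k):
        # Worst discrepancy at time k for this particular choice of s_1..s_k.
        worst = max(abs(seq.count(s) - expected) for s in range(n))
        best = worst if best is None else min(best, worst)
    return best
```

For each $n \geq 2$ the returned value equals $1 - 1/n$, since some symbol is always unused; letting $n$ grow shows that no constant $c < 1$ can work.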

It is also possible that Theorem 2 might be strengthened by controlling
the discrepancies between $N_k(s) - N_j(s)$ and $\pi_{j+1}(s) + \pi_{j+2}(s) + \cdots + \pi_k(s)$
for all $j, k$ with $1 \leq j < k$ and all $s$ in $S$. It is easy to deduce from Theorem
2 that every such discrepancy has absolute value less than 2 (since it is just
$D_k(s) - D_j(s)$), but perhaps one can show that there exists a way of choosing
$s_1, s_2, \ldots$ so that every such discrepancy has absolute value less than 1.

In determining $s_k$, our algorithm typically requires knowledge of the
future distributions $(\pi_i)_{i>k}$ (actually a finite but unbounded number of them).
This is unavoidable, as may be seen by the following example. Let $\pi_i$ be
uniform on $\{1, \ldots, 5\}$ for each of $i = 1, 2, 3$. Regardless of $s_1, s_2, s_3$ there will
be two $i$'s with $N_3(i) - P_3(i) = -0.6$. If $s_1, s_2, s_3$ are chosen in ignorance of
$\pi_4$, it is possible for $\pi_4$ to be uniform on those two $i$'s. Then any choice of
$s_4$ will result in $N_4(i) - P_4(i) = -1.1$ for some $i$. With more than 5 values
the discrepancies can be even larger in magnitude. It might be interesting to
know how good a bound on discrepancy can be achieved by algorithms that
are constrained to have $s_n$ depend only on $\pi_1, \ldots, \pi_n$. Of course, when the
$\pi_i$'s are all equal as in Theorem 1 this issue is nonexistent.
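
The five-symbol example can be verified by brute force. The following sketch (the name `best_worst_case` is hypothetical) tries every prefix $s_1, s_2, s_3$, lets an adversary place $\pi_4$ uniformly on two symbols not used so far, and then tries every $s_4$, recording the best value of $\min_i (N_4(i) - P_4(i))$ that any choice can reach.

```python
from fractions import Fraction
from itertools import product

SYMBOLS = range(1, 6)


def best_worst_case():
    """Best achievable min_i (N_4(i) - P_4(i)) when pi_4 is revealed only after s_3."""
    best = None
    for prefix in product(SYMBOLS, repeat=3):
        # At least two symbols are unused after three steps; the adversary picks two.
        unused = [i for i in SYMBOLS if i not in prefix][:2]
        # P_4(i) = 3/5 from the uniform steps, plus 1/2 on the two targeted symbols.
        P4 = {i: Fraction(3, 5) + (Fraction(1, 2) if i in unused else 0)
              for i in SYMBOLS}
        for s4 in SYMBOLS:
            seq = prefix + (s4,)
            worst = min(seq.count(i) - P4[i] for i in SYMBOLS)
            best = worst if best is None else max(best, worst)
    return best
```

The function returns `Fraction(-11, 10)`: whatever the first three terms and the fourth term are, one of the two targeted symbols is left with discrepancy $-1.1$.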

One way in which Theorem 1 might be strengthened is by finding a
construction that minimizes $\max_k \sum_{i \leq k} f(s_i)$, where $f$ is some function on $S$
satisfying $\sum_s f(s)\pi(s) = 0$. Sums of the form $\max_k \sum_{i \leq k} f(s_i)$ play an
important role in [8]; there the elements $s$ of $S$ correspond to transitions $u \to v$ in a
Markov chain (with $u$ fixed and $v$ varying), $\pi(s)$ equals the transition
probability $p(u,v)$, and $f(s)$ equals $h(v) - h(u)$ where the function $h(\cdot)$ is harmonic
at $u$ (i.e., $\sum_v p(u,v)h(v) = h(u)$), implying $\sum_s f(s)\pi(s) = 0$. For example,
consider random walk on $\mathbb{Z}^2$, where a vertex $u$ has four neighbors $u_N$, $u_S$, $u_E$,
and $u_W$ with $p(u,u_N) = p(u,u_S) = p(u,u_E) = p(u,u_W) = \frac{1}{4}$. Key results in [8] treat
rotor-routers that rotate in the repeating pattern N, E, S, W, N, E, S, W, ...
and show that the resulting rotor-router walks closely mimic certain
features of the random walk. However, since the relevant discrepancies are
controlled by quantities of the form $\max_k \sum_{i \leq k} f(s_i)$, and since the
function $h$ has the property that $h(u_N) + h(u_S)$ and $h(u_E) + h(u_W)$ are close to
$2h(u)$ (implying that $f(N) + f(S)$ and $f(E) + f(W)$ are close to zero), there
is reason to think that rotor-routers that rotate in the repeating pattern
N, S, E, W, N, S, E, W, ... would give smaller discrepancy for the quantities
of interest.
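
A toy computation illustrates the heuristic. The values of $f$ below are hypothetical, chosen so that $f(N) + f(S)$ and $f(E) + f(W)$ are exactly zero, and the sketch measures partial sums in absolute value, which is an assumption of this example rather than a statement from [8].

```python
from itertools import accumulate, cycle, islice


def max_partial_sum(pattern, f, steps):
    """Largest |f(s_1) + ... + f(s_k)| over k <= steps for the periodic rotor `pattern`."""
    values = (f[d] for d in islice(cycle(pattern), steps))
    return max(abs(total) for total in accumulate(values))


# Hypothetical increments with f(N) + f(S) = 0 and f(E) + f(W) = 0.
f = {"N": 3, "S": -3, "E": 5, "W": -5}

nesw = max_partial_sum("NESW", f, 8)  # partial sums 3, 8, 5, 0, 3, 8, 5, 0
nsew = max_partial_sum("NSEW", f, 8)  # partial sums 3, 0, 5, 0, 3, 0, 5, 0
```

With these values the pattern N, E, S, W reaches a partial sum of 8, while N, S, E, W never exceeds 5, because opposite directions cancel immediately.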

As an important special case of Theorem 1, suppose $\pi$ is rational. It is
then natural to ask that the sequence $s_1, s_2, \ldots$ be periodic (so that one has
a "rotor" in the sense of [8]). Our algorithm as described does not guarantee
periodicity, because we allowed ties to be broken arbitrarily. However, if
we add the stipulation that ties are always broken in some pre-determined
way depending only on the deadlines and the discrepancies $D_k(s)$, then our
algorithm yields a periodic sequence, with period equal to the least common
multiple $m$ of the denominators of the rational numbers $\pi(s)$. Indeed, it is
clear that the sequence generated by the algorithm is eventually periodic, and
that the period cannot be less than $m$. On the other hand, the construction
gives $|N_m(s) - P_m(s)| < 1$; but $N_m(s)$ and $P_m(s)$ are both integers, so they
must be equal. Hence $D_m(s) = 0 = D_0(s)$ for all $s$, so the procedure enters
a loop at time $m$.
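
The periodicity claim is easy to observe numerically. The sketch below is one possible realization, not the only one: it specializes the earliest-deadline rule to a constant rational $\pi$, uses the closed form $\max(k+1, \lceil (N_k(s)+1)/\pi(s) \rceil)$ for the deadline, and breaks ties by symbol order (a fixed rule, so the choice depends only on the current discrepancies). The names `rotor_sequence` and `period` are illustrative.

```python
from fractions import Fraction
from functools import reduce
from math import ceil, gcd


def rotor_sequence(pi, steps):
    """Run the earliest-deadline rule for a constant rational distribution `pi`."""
    symbols = sorted(pi)
    N = {s: 0 for s in symbols}
    seq = []
    for k in range(steps):
        # Candidates satisfy N_k(s) - (k+1) pi(s) < 0 (this excludes pi(s) = 0).
        candidates = [s for s in symbols if N[s] - (k + 1) * pi[s] < 0]

        def deadline(s):
            # Smallest k' >= k+1 with N_k(s) - k' pi(s) <= -1.
            return max(k + 1, ceil((N[s] + 1) / pi[s]))

        chosen = min(candidates, key=lambda s: (deadline(s), s))
        seq.append(chosen)
        N[chosen] += 1
    return seq


def period(pi):
    """Least common multiple m of the denominators of the values of `pi`."""
    denominators = (p.denominator for p in pi.values())
    return reduce(lambda a, b: a * b // gcd(a, b), denominators, 1)
```

For instance, with `pi = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}` the period is 6, and the first twelve terms returned by `rotor_sequence(pi, 12)` consist of the same block of six repeated twice.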

Theorem 1 can be rephrased as follows: for any sequence of non-negative
real numbers $\pi(1), \pi(2), \ldots$ summing to 1, there exists a partition of the
natural numbers into sets $F_1, F_2, \ldots$ where for all $i$ the set $F_i$ has density $\pi(i)$,
and $|F_i \cap \{1, \ldots, k\}| - \pi(i)k$ lies in $[-1,1]$ for all $k \geq 1$ (of course the second
property implies the first). An even stronger condition we might seek is that
the gap between the $m$th and $n$th elements of $F_i$ (for all $i$ and all $n > m \geq 1$)
is within 1 of $(n - m)/\pi(i)$. If "within 1" is interpreted in the strict sense
(i.e., the difference is strictly less than 1), then this condition cannot always
be achieved; e.g., with $\pi = (\frac{1}{2}, \frac{1}{3}, \frac{1}{6})$, the only way to satisfy the condition
would be to partition the natural numbers into three arithmetic progressions
with densities $\frac{1}{2}$, $\frac{1}{3}$, and $\frac{1}{6}$, which clearly cannot be done. However, if "within
1" is interpreted in the weak sense (i.e., the difference is less than or equal
to 1), then we do not know of a counterexample.
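
The impossibility of such a partition into arithmetic progressions can be confirmed by a finite search over residue classes. The sketch below is illustrative only (`exact_covers` is our own name); it models each progression as a full residue class $a_i \bmod d_i$ and tests whether every residue modulo the product of the $d_i$ is covered exactly once.

```python
from itertools import product
from math import prod


def exact_covers(moduli):
    """Offset tuples (a_1, ..., a_r) for which the classes a_i mod d_i partition the integers."""
    period = prod(moduli)  # a common multiple of all moduli
    return [
        offsets
        for offsets in product(*(range(m) for m in moduli))
        if all(
            sum(r % m == a for m, a in zip(moduli, offsets)) == 1
            for r in range(period)
        )
    ]
```

Here `exact_covers([2, 3, 6])` is empty, because a class modulo 2 and a class modulo 3 always intersect, whereas `exact_covers([2, 4, 4])` is non-empty (for example the even numbers together with the classes 1 and 3 modulo 4).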